منابع مشابه
Automatic Web Page Classification
Aim of this paper is to describe a method of automatic web page classification to semantic domains and its evaluation. The classification method exploits machine learning algorithms and several morphological as well as semantical text processing tools. In contrast to general text document classification, in the web document classification there are often problems with short web pages. In this p...
متن کاملAutomatic Web Page Classification
To facilitate user browsing of Web, some websites such as Yahoo! (http://dir.yahoo.com) and Open Directory Project (http://dmoz.org) manually maintain a hierarchical structure. While manual classification of web pages provides high accuracy, it is very expensive. To automatically include new emerging pages into these hierarchies, web page classification becomes a hot research topic in web infor...
متن کاملWeb Page Downloading and Classification
This paper describes the processes of downloading and classifying Web-based articles in online medical journals as a preliminary step to extracting bibliographic data to populate MEDLINE , the widely used database of the National Library of Medicine (NLM). The processes are combined to develop an automated system named “Web Page Downloading and Classification”. The system downloads the Web page...
متن کاملLarge-Scale Web Page Classification
This research investigates the design of a unified framework for the content-based classification of highly imbalanced hierarchical datasets, such as web directories. In an imbalanced dataset, the prior probability distribution of a category indicates the presence or absence of class imbalance. This may include the lack of positive training instances (rarity) or an overabundance of positive ins...
متن کاملWeb Page Classification: a Semantic Analysis
In this paper, a semantic analysis for Web page classification is presented. A set of Web pages, resulting from a simple query to a Web browser, is categorized by disambiguating the meaning of the term used for the search. The disambiguation process begins with the isolation of some outstanding paragraphs; linguistic markers are used to accomplish this task. The search term is located within th...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: International Journal of Computer and Electrical Engineering
سال: 2014
ISSN: 1793-8163
DOI: 10.7763/ijcee.2014.v6.793